Classifying patents based on their semantic content

Authors

  • Antonin Bergeaud
  • Yoann Potiron
  • Juste Raimbault
Abstract

In this paper, we extend some usual techniques of classification resulting from a large-scale data-mining and network approach. This new technology, designed in particular to suit big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US Patent Office from 1976 onward. To build the pattern network, we look not only at each patent's title but also at its full abstract, and extract the relevant keywords accordingly. We refer to this classification as the semantic approach, in contrast with the more common technological approach, which takes the topology obtained from the US Patent Office technological classes. Moreover, we document that the two approaches have highly different topological measures, and we find strong statistical evidence that they feature a different model. This suggests that our method is a useful tool for extracting endogenous information.
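The abstract does not spell out how keywords are extracted or how the network is assembled, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes a naive tokenizer, a tiny hand-written stopword list, and a co-occurrence rule (two keywords are linked once for every patent that contains both). The function names `extract_keywords` and `cooccurrence_network` and the two toy records are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption; a real pipeline would use a fuller one).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "with", "by", "is", "using"}


def extract_keywords(title, abstract):
    """Return the set of lowercase non-stopword tokens found in title and abstract."""
    tokens = re.findall(r"[a-z]+", f"{title} {abstract}".lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 2}


def cooccurrence_network(patents):
    """Map each keyword pair to the number of patents that contain both keywords."""
    edges = Counter()
    for title, abstract in patents:
        # Sorting gives every pair a canonical (alphabetical) orientation.
        keywords = sorted(extract_keywords(title, abstract))
        edges.update(combinations(keywords, 2))
    return edges


# Made-up toy records, not real patent data.
patents = [
    ("Laser diode", "A semiconductor laser with improved cooling."),
    ("Cooling system", "Cooling of a semiconductor device."),
]
network = cooccurrence_network(patents)
```

Edge weights in `network` could then feed a community-detection step to obtain semantic classes; which algorithm to use there is left open, since the abstract does not specify one.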

Similar resources

Discovering Structure in Design Databases Through Functional and Surface Based Mapping

This work presents a methodology for discovering structure in design repository databases, toward the ultimate goal of stimulating designers through design-by-analogy. Using a Bayesian model combined with latent semantic analysis (LSA) for discovering structural form in data, an exploration of inherent structural forms, based on the content and similarity of design data, is undertaken to gain u...


Ontology matching for patent classification

Interdisciplinary research and development projects in medical engineering benefit from well selected collaboration partners. The process of finding such partners from often unfamiliar fields is difficult, but can be supported by an expert profile that is based on patent analysis and classifying the patents to competence fields in medical engineering. Patent analysis and categorization are difficult and re...


Automatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach

In social networking/microblogging environments, a #tag is often used for categorizing messages and marking their key points. Also, since some social networks such as Twitter restrict the number of characters in messages, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...


Image Semantic Classification Using SVM In Image Retrieval

There is a gap between low-level descriptions of image content and the semantic understanding that users rely on when querying image databases in content-based image retrieval. In this paper, we put forward a method of classifying image regions hierarchically using their semantics, which resembles people's perception more closely than using low-level features. The experiments show, the better precision of seman...


Using Dual Cascading Learning Frameworks for Image Indexing

To bridge the semantic gap in content-based image retrieval, detecting meaningful visual entities (e.g. faces, sky, foliage, buildings etc) in image content and classifying images into semantic categories based on trained pattern classifiers have become active research trends. In this paper, we present dual cascading learning frameworks that extract and combine intraimage and inter-class semant...



Journal title:

Volume 12, Issue

Pages -

Publication date: 2017